A Thread Tranquilizer: Dynamically Reducing Performance Variation

نویسندگان

  • Kishore Kumar Pusukuri
  • Rajiv Gupta
  • Laxmi N. Bhuyan
چکیده

To realize the performance potential of multicore systems, we must effectively manage the interactions between memory reference behaviour and the operating system policies for thread scheduling and migration decisions. We observe that these interactions lead to significant variations in the performance of a given application, from one execution to the next, even when the program input remains unchanged and no other applications are being run on the system. Our experiments with multithreaded programs, including the TATP database application, SPECjbb2005, and a subset of PARSEC and SPEC OMP programs, on a 24-core Dell PowerEdge R905 server running OpenSolaris confirms the above observation. In this work we develop Thread Tranquilizer, an automatic technique for simultaneously reducing performance variation and improving performance by dynamically choosing appropriate memory allocation and process scheduling policies. Thread Tranquilizer uses simple utilities available on modern Operating Systems for monitoring cache misses and thread context-switches and then utilizes the collected information to dynamically select appropriate memory allocation and scheduling policies. In our experiments, Thread Tranquilizer yields up to 98% (average 68%) reduction in performance variation and up to 43% (average 15%) improvement in performance over default policies of OpenSolaris. We also demonstrate that Thread Tranquilizer simultaneously reduces performance variation and improves performance of the programs on Linux. Thread Tranquilizer is easy to use as it does not require any changes to the application source code or the OS kernel.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Adaptive Performance Optimization under Power Constraint in Multi-thread Applications with Diverse Scalability

In modern data centers, energy usage represents one of the major factors affecting operational costs. Power capping is a technique that limits the power consumption of individual systems, which allows reducing the overall power demand at both cluster and data center levels. However, literature power capping approaches do not fit well the nature of important applications based on first-class mul...

متن کامل

THESIS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Techniques to Reduce Thread-Level Speculation Overhead

The traditional single-core processors are being replaced by chip multiprocessors (CMPs) where several processor cores are integrated on a single chip. While this is beneficial for multithreaded applications and multiprogrammed workloads, CMPs do not provide performance improvements for single-threaded applications. Threadlevel speculation (TLS) has been proposed as a way to improve single-thre...

متن کامل

Performance Study and Dynamic Optimization Design for Thread Pool Systems

Thread pools have been widely used by many multithreaded applications. However, the determination of the pool size according to the application behavior still remains problematic. To automate this process, in this paper we have developed a set of performance metrics for quantitatively analyzing thread pool performance. For our experiments, we built a thread pool system which provides a general ...

متن کامل

Dynamic Thread Pinning for Phase-Based OpenMP Programs

Thread affinity has appeared as an important technique to improve the overall program performance and for better performance stability. However, if we consider a program with multiple phases, it is unlikely that a single thread affinity produces the best program performance for all these phases. If we consider the case of OpenMP, applications may have multiple parallel regions, each with a dist...

متن کامل

A High Performance Adaptive Miss Handling Architecture for Chip Multiprocessors

Chip Multiprocessors (CMPs) mainly base their performance gains on exploiting thread-level parallelism. Consequently, powerful memory systems are needed to support an increasing number of concurrent threads. Conventional CMP memory systems do not account for thread interference which can result in reduced overall system performance. Therefore, conventional high bandwidth Miss Handling Architect...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012